Last Layer Empirical Bayes
Villecroze, Valentin, Wang, Yixin, Loaiza-Ganem, Gabriel
The task of quantifying the inherent uncertainty associated with neural network predictions is a key challenge in artificial intelligence. Bayesian neural networks (BNNs) and deep ensembles are among the most prominent approaches to tackle this task. Both approaches produce predictions by computing an expectation of neural network outputs over some distribution on the corresponding weights; this distribution is given by the posterior in the case of BNNs, and by a mixture of point masses for ensembles. Inspired by recent work showing that the distribution used by ensembles can be understood as a posterior corresponding to a learned data-dependent prior, we propose last layer empirical Bayes (LLEB). LLEB instantiates a learnable prior as a normalizing flow, which is then trained to maximize the evidence lower bound; to retain tractability we use the flow only on the last layer. We show why LLEB is well motivated, and how it interpolates between standard BNNs and ensembles in terms of the strength of the prior that they use. LLEB performs on par with existing approaches, highlighting that empirical Bayes is a promising direction for future research in uncertainty quantification.
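To make the ELBO objective concrete, here is a minimal NumPy sketch of the idea: a Gaussian variational posterior over only the last-layer weights, with a learnable prior whose density is evaluated in closed form. This is not the paper's implementation — the "flow" here is a single affine layer standing in for a full normalizing flow, and all shapes, data, and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: Phi holds frozen backbone activations; only the
# last-layer weights w receive a Bayesian treatment, as in LLEB.
Phi = rng.normal(size=(32, 4))                       # penultimate-layer features
y = Phi @ rng.normal(size=4) + 0.1 * rng.normal(size=32)

def log_normal(x, mean, log_std):
    """Sum of independent Gaussian log-densities."""
    z = (x - mean) / np.exp(log_std)
    return float(np.sum(-0.5 * z**2 - log_std - 0.5 * np.log(2.0 * np.pi)))

# Variational posterior q(w) = N(mu, diag(exp(log_sigma)^2)).
mu, log_sigma = np.zeros(4), np.full(4, -1.0)

# Learnable prior: one affine "flow" pushing N(0, I) to N(a, diag(exp(log_b)^2)).
# A real normalizing flow would stack richer invertible layers.
a, log_b = np.zeros(4), np.full(4, 0.5)

def elbo(n_samples=256, obs_log_std=float(np.log(0.1))):
    """Monte Carlo ELBO: E_q[log p(y | w) + log p(w) - log q(w)]."""
    total = 0.0
    for _ in range(n_samples):
        eps = rng.normal(size=4)
        w = mu + np.exp(log_sigma) * eps             # reparameterized sample from q
        log_lik = log_normal(y, Phi @ w, obs_log_std)
        log_prior = log_normal(w, a, log_b)          # density under the affine prior
        log_q = log_normal(w, mu, log_sigma)
        total += log_lik + log_prior - log_q
    return total / n_samples

print(elbo())
```

In the empirical Bayes spirit described above, this objective would be maximized jointly over the posterior parameters (mu, log_sigma) and the prior parameters (a, log_b), rather than fixing the prior in advance.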
Possibility for Proactive Anomaly Detection
Jeon, Jinsung, Park, Jaehyeon, Park, Sewon, Choi, Jeongwhan, Kim, Minjung, Park, Noseong
Time-series anomaly detection, which detects errors and failures in a workflow, is one of the most important topics in real-world applications. The purpose of time-series anomaly detection is to reduce potential damage or loss. However, existing anomaly detection models detect anomalies through the error between the model output and the ground-truth (observed) value, meaning an anomaly can only be flagged after the anomalous value has already been observed; this makes them reactive and thus impractical for preventing damage. In this work, we present a \textit{proactive} approach for time-series anomaly detection based on a time-series forecasting model specialized for anomaly detection and a data-driven anomaly detection model. Our proactive approach establishes an anomaly threshold from training data with a data-driven anomaly detection model, and anomalies are subsequently detected by identifying predicted values that exceed the anomaly threshold. In addition, we extensively evaluate the model using four anomaly detection benchmarks and analyze both predictable and unpredictable anomalies. We attach the source code as supplementary material.
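The two-stage recipe in the abstract — fit a threshold on training data, then flag forecasts that breach it — can be sketched as follows. This is a toy illustration, not the paper's models: the data-driven detector is replaced by an extreme quantile of training values, and the forecaster by linear extrapolation; both are hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_threshold(train_series, q=0.999):
    # Stand-in for a data-driven anomaly-detection model:
    # an extreme quantile of the magnitudes seen in normal training data.
    return np.quantile(np.abs(train_series), q)

def proactive_detect(history, forecast_fn, threshold, horizon=1):
    # Forecast future values and flag any that exceed the threshold,
    # i.e. raise the alarm *before* the anomalous value is observed.
    preds = forecast_fn(history, horizon)
    return np.abs(preds) > threshold, preds

def extrapolate(history, horizon):
    # Hypothetical forecaster: linear extrapolation of the last two points.
    slope = history[-1] - history[-2]
    return history[-1] + slope * np.arange(1, horizon + 1)

train = rng.normal(0.0, 0.1, size=1000)          # normal operating regime
thr = fit_threshold(train)
history = np.append(train, [1.0, 3.0])           # a sharp upward trend begins
flags, preds = proactive_detect(history, extrapolate, thr, horizon=3)
print(flags)
```

With the upward trend in the toy history, the extrapolated values breach the threshold before they are observed, which is exactly the proactive behavior the abstract contrasts with error-based (after-the-fact) detection.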
Modeling speech emotion with label variance and analyzing performance across speakers and unseen acoustic conditions
Mitra, Vikramjit, Romana, Amrit, Tran, Dung T., Azemi, Erdrin
Spontaneous speech emotion data usually contain perceptual grades, where graders assign an emotion score after listening to the speech files. Such perceptual grades introduce uncertainty in labels due to variation in grader opinion. Grader variation is typically addressed by using consensus grades as ground truth, where the emotion with the highest vote is selected. Consensus grades fail to account for ambiguous instances in which a speech sample may contain multiple emotions, as captured through grader opinion uncertainty. We demonstrate that using the probability density function of the emotion grades as targets, instead of the commonly used consensus grades, provides better performance on benchmark evaluation sets than results reported in the literature. We show that saliency-driven foundation model (FM) representation selection helps to train a state-of-the-art speech emotion model for both dimensional and categorical emotion recognition. Comparing representations obtained from different FMs, we observe that focusing on overall test-set performance can be deceiving, as it fails to reveal a model's generalization capacity across speakers and genders. We demonstrate that performance evaluation across multiple test sets, together with performance analysis across genders and speakers, is useful in assessing the usefulness of emotion models. Finally, we demonstrate that label uncertainty and data skew pose a challenge to model evaluation, where instead of using only the best hypothesis, it is useful to consider the 2- or 3-best hypotheses.
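The contrast between consensus grades and distributional targets can be made concrete with a small sketch: the consensus label collapses grader votes to a one-hot vector, while the normalized vote histogram preserves the ambiguity. The emotion classes, vote counts, and model logits below are all hypothetical, and this is only an illustration of the target construction, not the paper's training pipeline.

```python
import numpy as np

# Hypothetical grader votes for one utterance over four emotion classes.
emotions = ["angry", "happy", "neutral", "sad"]
votes = np.array([1.0, 0.0, 3.0, 2.0])            # 6 graders, split opinions

# Consensus label: argmax over votes, discarding the ambiguity entirely.
consensus = np.zeros_like(votes)
consensus[np.argmax(votes)] = 1.0                 # one-hot "neutral"

# Distributional target: normalized vote histogram keeps grader uncertainty.
soft_target = votes / votes.sum()                 # [1/6, 0, 1/2, 1/3]

def cross_entropy(target, log_probs):
    """Loss against a (possibly soft) target distribution."""
    return float(-np.sum(target * log_probs))

# Example model output; log-softmax over hypothetical logits.
logits = np.array([-2.0, -3.0, 0.6, 0.4])
log_probs = logits - np.log(np.sum(np.exp(logits)))

print(cross_entropy(soft_target, log_probs), cross_entropy(consensus, log_probs))
```

Trained against `soft_target`, the model is rewarded for reflecting that half the graders heard "neutral" and a third heard "sad"; trained against `consensus`, that minority signal is discarded.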
On the Limits of Applying Graph Transformers for Brain Connectome Classification
Lara-Rangel, Jose, Heinbaugh, Clare
However, it did not produce improvements in performance or other notable benefits; see Appendix 1. For the Exphormer, we experimented with different numbers of layers, dropout rates for the network and attention mechanism, and numbers of attention heads. The final configuration used a dropout probability of 0.1, attention dropout of 0.3, 2 layers, and 4 attention heads. All experiments used learning rate decay starting at 0.001, decaying by 1e-5, over a total of 100 epochs with 5 warmup epochs. We used three different seeds for both the Exphormer and ResidualGCN and assessed the alignment with the results in Said et al. (2023), which included only one run for each experiment. Apart from evaluating performance, we investigated potential advantages of using attention-based models. Our hypothesis was that the attention mechanism could enhance robustness to data noise, particularly in scenarios where certain graph structure components, such as edges, are missing. To verify that the graph structure (nodes and edges taken together) conveys meaningful information for prediction, it is important to compare models under noisy or incomplete data settings. We simulate noisy, incomplete data by removing edges based on a pre-specified probability of edge removal.
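The edge-removal corruption described above amounts to independent edge dropout on the connectome graph. A minimal sketch, with a hypothetical toy graph (the actual connectome data and pipeline are not shown here):

```python
import random

def drop_edges(edges, p_remove, seed=0):
    """Simulate noisy/incomplete graph data by deleting each edge
    independently with a pre-specified removal probability."""
    rng = random.Random(seed)                 # seeded for reproducibility
    return [e for e in edges if rng.random() >= p_remove]

# Toy connectome stand-in: a complete undirected graph on 10 nodes (45 edges).
edges = [(i, j) for i in range(10) for j in range(i + 1, 10)]
corrupted = drop_edges(edges, p_remove=0.3)
print(len(edges), len(corrupted))            # on average ~30% of edges removed
```

Sweeping `p_remove` and re-evaluating each model on the corrupted graphs gives the robustness comparison the hypothesis calls for.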
Know Thy Judge: On the Robustness Meta-Evaluation of LLM Safety Judges
Eiras, Francisco, Zemour, Eliott, Lin, Eric, Mugunthan, Vaikkunth
Large Language Model (LLM) based judges form the underpinnings of key safety evaluation processes such as offline benchmarking, automated red-teaming, and online guardrailing. This widespread reliance raises a crucial question: can we trust the evaluations of these evaluators? In this paper, we highlight two critical challenges that are typically overlooked: (i) evaluations in the wild, where factors like prompt sensitivity and distribution shifts can affect performance, and (ii) adversarial attacks that target the judge. We highlight the importance of these through a study of commonly used safety judges, showing that small changes, such as the style of the model output, can lead to jumps of up to 0.24 in the false negative rate on the same dataset, whereas adversarial attacks on the model generation can fool some judges into misclassifying 100% of harmful generations as safe ones. These findings reveal gaps in commonly used meta-evaluation benchmarks and weaknesses in the robustness of current LLM judges, indicating that low attack success under certain judges could create a false sense of security.

Well-known jailbreak attacks on widely used Large Language Models (LLMs) such as ChatGPT have raised concerns about the robustness of these systems to safety violations. As a result, organizations deploying them typically rely on a two-pronged approach to safety: 1) offline benchmarking and red-teaming (Mazeika et al., 2024; Perez et al., 2022; Ganguli et al., 2022), and 2) online guardrails designed to minimize the risk from attacks (Mu et al., 2024; Manczak et al., 2024; Neill et al., 2024).